Fractional Regression Nearest Neighbor Imputation
Abstract
Sample surveys typically gather information on a sample of units from a finite population and assign survey weights to the sampled units. Surveys frequently have missing values for some variables on some units. Fractional imputation methods create more than one imputed value for each missing value and assign fractional survey weights to the imputed values. Fractional regression imputation creates these multiple values by adding randomly selected empirical residuals to predicted values. Fractional nearest neighbor imputation randomly selects multiple donors for each missing value from a set of nearest neighbors. The fractional regression nearest neighbor imputation method developed in this paper imputes more than one value for each missing item using donors that are neighbors, where neighbors are selected by a distance calculation involving both regression model predictions and the variables used in other nearest neighbor methods. Different distance function specifications, which can involve both observed and predicted values, produce alternative imputation procedures. We compare the performance of fractional imputation methods, including fractional regression nearest neighbor imputation, in a simulation study. In addition, we examine empirically the performance of these imputation methods on a subset of data from the Iowa Family Transitions Project under different missing data assumptions.
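The donor-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's specification: the function names `frnni` and `weighted_mean`, the single covariate `x`, the mixing parameter `alpha`, the neighborhood size `k`, the number of donors `m`, and the particular distance (a weighted combination of squared differences in predicted values and in the covariate) are all assumptions made for illustration. The sketch fits a survey-weighted linear regression on respondents, finds the `k` nearest respondents for each unit with a missing value, draws `m` donors at random from them, and splits the recipient's survey weight equally among the donated values.

```python
import numpy as np

def frnni(x, y, w, k=5, m=3, alpha=0.5, seed=0):
    """Hypothetical sketch of fractional regression nearest neighbor imputation.

    Returns a list of (recipient, donor, imputed_value, fractional_weight) rows.
    Missing values in y are coded as NaN.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    resp = np.flatnonzero(~np.isnan(y))   # respondents (potential donors)
    miss = np.flatnonzero(np.isnan(y))    # units needing imputation

    # Survey-weighted least squares of y on (1, x), respondents only.
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w[resp])
    beta, *_ = np.linalg.lstsq(X[resp] * sw[:, None], y[resp] * sw, rcond=None)
    yhat = X @ beta                       # predictions for every unit

    rows = []
    for i in miss:
        # Assumed distance: mixes predicted-value and covariate differences.
        d = np.sqrt(alpha * (yhat[i] - yhat[resp]) ** 2
                    + (1.0 - alpha) * (x[i] - x[resp]) ** 2)
        neighbors = resp[np.argsort(d)[:k]]
        donors = rng.choice(neighbors, size=min(m, len(neighbors)), replace=False)
        for j in donors:
            # Each donated value carries an equal share of the recipient's weight.
            rows.append((int(i), int(j), float(y[j]), w[i] / len(donors)))
    return rows

def weighted_mean(y, w, rows):
    """Weighted mean using observed values plus fractionally weighted imputations."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    obs = ~np.isnan(y)
    total = np.sum(w[obs] * y[obs]) + sum(fw * val for _, _, val, fw in rows)
    return total / np.sum(w)
```

Setting `alpha=1` in this sketch selects neighbors on predicted values only, while `alpha=0` reduces it to a nearest neighbor search on the covariate, which mirrors the abstract's point that different distance specifications yield different procedures. In practice the two terms would likely need to be put on a comparable scale before mixing.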
Similar resources
A comparison study of nonparametric imputation methods
Consider estimation of a population mean of a response variable when the observations are missing at random with respect to the covariate. Two common approaches to imputing the missing values are the nonparametric regression weighting method and the Horvitz-Thompson (HT) inverse weighting approach. The regression approach includes the kernel regression imputation and the nearest neighbor imputa...
Biases and Variances of Survey Estimators Based on Nearest Neighbor Imputation
Jiahua Chen (University of Waterloo), Jun Shao (University of Wisconsin-Madison). Nearest neighbor imputation is one of the hot deck methods used to compensate for nonresponse in sample surveys. Although it has a long history of application, theoretical properties of the nearest neighbor imputation method were unknown prior to the current paper. We show that unde...
An Effective Technique of Multiple Imputation in Nonparametric Quantile Regression
In this study, we consider the nonparametric quantile regression model with covariates missing at random (MAR). Multiple imputation is an increasingly popular approach for analyzing missing data, but its combination with quantile regression is not well developed. We propose an effective and accurate two-stage multiple imputation method for the model based on the quantile regression, whi...
Variance Estimation for Nearest Neighbor Imputation for U.S. Census Long Form Data
Variance estimation for estimators of state, county, and school district quantities derived from the Census 2000 long form is discussed. The variance estimator must account for (1) uncertainty due to imputation, and (2) raking to census population controls. An imputation procedure that imputes more than one value for each missing item using donors that are neighbors is described and the proced...
Improved methods for the imputation of missing data by nearest neighbor methods
Missing data is an important issue in almost all fields of quantitative research. A nonparametric procedure that has been shown to be useful is the nearest neighbor imputation method. We suggest a weighted nearest neighbor imputation method based on Lq-distances. The weighted method is shown to have smaller imputation error than available NN estimates. In addition we consider weighted neighbor ...